07 - OpenCV (In progress)

name: portada
layout: true
class: portada-slide, middle, right
---

# General Concepts of Artificial Intelligence
## Introduction to Machine Learning

.footnote[Joan Puigcerver]

---
layout: true
class: regular-slide
.right[.logo[![logoMislata](/itb/images/logo_mislata.png)]]

---
# Index
## .blue[0.] Installation
## .blue[1.] Basic operations
## .blue[2.] Basic transformations
## .blue[3.] Drawing
## .blue[4.] Combining images

---
# .blue[0.] Installation
Install the [opencv-python](https://pypi.org/project/opencv-python/) package:
```bash
pip install opencv-python
```

To use it, import it:
```python
#!/usr/bin/env python3

import cv2 as cv

# Code
# ...
```
---
# .blue[1.] Basic operations
## Reading images
```python
#!/usr/bin/env python3
import cv2 as cv

# Read an image from disk
img = cv.imread("path/to/file")

# Show the image
cv.imshow("Title", img)

# Wait X milliseconds (0 waits indefinitely) for a key press
# This keeps the program running so the window stays visible
# IMPORTANT! Press any key to end the program
cv.waitKey(0)
```
---
# .blue[1.] Basic operations
## Reading video
```python
#!/usr/bin/env python3

import cv2 as cv

# Open a video source. Use ONE of the two options:
# 0, 1, 2, ...: index of the capture device (webcam)
capture = cv.VideoCapture(0)

# Video from disk
# capture = cv.VideoCapture("path/to/file")

while True:
    isTrue, frame = capture.read()

    # Stop when no frame could be read (end of video or device error)
    if not isTrue:
        break

    cv.imshow("Video", frame)

    # Wait 20 milliseconds before showing the next frame
    # Pressing the 'd' key ends the loop and the program
    if cv.waitKey(20) & 0xFF == ord('d'):
        break

capture.release()
cv.destroyAllWindows()
```
---
# .blue[2.] Basic transformations
## Resizing (.blue[resize])
* cv.resize(img, dimensions, interpolation=...)
  * __img__: Image or frame.
  * __dimensions__: Tuple with the target size (width, height), in pixels.
  * __interpolation__: Interpolation type, passed by keyword ([docs](https://docs.opencv.org/3.4/da/d54/group__imgproc__transform.html))

```python
img = cv.imread("path/to/file")
# The third positional argument of cv.resize is dst, so interpolation is passed by keyword
resized = cv.resize(img, (300, 200), interpolation=cv.INTER_AREA)
cv.imshow("Resized", resized)
```

This method resizes to exactly the size given in __dimensions__, without
taking the [aspect ratio](https://ca.wikipedia.org/wiki/Relaci%C3%B3_d%27aspecte) into account.

---
# .blue[2.] Basic transformations
## Resizing (.blue[resize])
To preserve the aspect ratio we can define a function like the following:

* resize(img, factor, interpolation)
  * __img__: Image or frame.
  * __factor__: Factor by which the size of __img__ is scaled.
  For example, factor=0.5 halves the width and height of the image.
  * __interpolation__: Interpolation type ([docs](https://docs.opencv.org/3.4/da/d54/group__imgproc__transform.html))

```python
def resize(img, factor, interpolation=cv.INTER_AREA):
    (height, width) = img.shape[:2]
    dimensions = (int(width*factor), int(height*factor))
    # interpolation goes by keyword: the third positional argument is dst
    return cv.resize(img, dimensions, interpolation=interpolation)

img = cv.imread("path/to/file")
resized = resize(img, 0.8)
cv.imshow("Resized", resized)
```
---
# .blue[2.] Basic transformations
## Translating (.blue[translate])
Transformation by matrix multiplication with NumPy. ([Wiki](https://en.wikipedia.org/wiki/Translation_(geometry)))

* translate(img, x, y):
  * __img__: Image or frame
  * __x__: Horizontal shift in pixels (positive: right, negative: left)
  * __y__: Vertical shift in pixels (positive: down, negative: up)

```python
import numpy as np

def translate(img, x, y):
    # 2x3 affine matrix that shifts every pixel by (x, y)
    transMat = np.float32([[1, 0, x], [0, 1, y]])
    dimensions = (img.shape[1], img.shape[0])  # (width, height)
    return cv.warpAffine(img, transMat, dimensions)

img = cv.imread("path/to/file")
translated = translate(img, 100, 100)
cv.imshow("Translated", translated)
```
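
To see what the 2x3 matrix does, it can be applied by hand to a single pixel position written in homogeneous coordinates (px, py, 1). The values below are arbitrary illustrations:

```python
import numpy as np

x, y = 100, 50
transMat = np.float32([[1, 0, x], [0, 1, y]])

# A pixel position in homogeneous coordinates: (px, py, 1)
point = np.float32([20, 30, 1])

# Each row adds the shift to one coordinate: (px + x, py + y)
moved = transMat @ point
print(moved)
```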

---
# .blue[2.] Basic transformations
## Rotating (.blue[rotate])
Transformation by matrix multiplication, with the matrix built by `cv.getRotationMatrix2D`. ([Wiki](https://ca.wikipedia.org/wiki/Matriu_de_rotaci%C3%B3))
* rotate(img, angle, rotPoint):
  * __img__: Image or frame
  * __angle__: Angle in degrees (negative: clockwise, positive: counterclockwise)
  * __rotPoint__: Rotation point, by default the center of the image.

```python
def rotate(img, angle, rotPoint=None):
    (height, width) = img.shape[:2]

    if rotPoint is None:
        rotPoint = (width//2, height//2)

    rotMat = cv.getRotationMatrix2D(rotPoint, angle, 1.0)
    return cv.warpAffine(img, rotMat, (width, height))

img = cv.imread("path/to/file")
rotated = rotate(img, -45)
cv.imshow("Rotated", rotated)
```
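
As a sketch of what the rotation matrix contains, the function below rebuilds it with NumPy following the formula in the OpenCV documentation for `getRotationMatrix2D`. The name `rotation_matrix` is hypothetical and the function is meant only for illustration; in practice the OpenCV call is the one to use.

```python
import math

import numpy as np


def rotation_matrix(center, angle, scale=1.0):
    """2x3 affine matrix rotating by `angle` degrees around `center`."""
    cx, cy = center
    a = scale * math.cos(math.radians(angle))
    b = scale * math.sin(math.radians(angle))
    return np.array([
        [a, b, (1 - a) * cx - b * cy],
        [-b, a, b * cx + (1 - a) * cy],
    ])


rotMat = rotation_matrix((300, 200), -45)

# The rotation point itself does not move
print(rotMat @ np.array([300, 200, 1]))
```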
---
# .blue[2.] Basic transformations
## Flipping (.blue[flip])
* cv.flip(img, flipCode): [Docs](https://docs.opencv.org/3.4/d2/de8/group__core__array.html#gaca7be533e3dac7feb70fc60635adf441)
  * __img__: Image or frame
  * __flipCode__: Code. {0: around the x-axis (vertical flip), 1: around the y-axis (horizontal flip), -1: both}

```python
img = cv.imread("path/to/file")
flipped = cv.flip(img, 0)
cv.imshow("Flipped", flipped)
```

---
# .blue[2.] Basic transformations
## Cropping (.blue[crop])
Array __slicing__ operations in Python. We could define the following function:
* crop(img, p1, p2):
  * __img__: Image or frame
  * __p1__: Top-left point, given as (row, column)
  * __p2__: Bottom-right point, given as (row, column)

```python
def crop(img, p1, p2):
    # NumPy indexes rows (y) first, then columns (x)
    return img[p1[0]:p2[0], p1[1]:p2[1]]

img = cv.imread("path/to/file")
# These two lines are equivalent
cropped = crop(img, (100, 150), (200, 400))
cropped = img[100:200, 150:400]
cv.imshow("Cropped", cropped)
```
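
A NumPy slice is a view of the original array, not a copy, so drawing on a crop also changes the source image. The sketch below uses a synthetic image to show the difference and how `.copy()` avoids it:

```python
import numpy as np

img = np.zeros((480, 640, 3), dtype=np.uint8)

view = img[100:200, 150:400]                 # shares memory with img
independent = img[100:200, 150:400].copy()   # owns its own pixels

view[:] = 255                                # also paints the region in img

print(view.shape)                            # (rows, columns, channels)
print(img[100, 150], independent[0, 0])
```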
---
# .blue[3.] Drawing
## Rectangle
* cv.rectangle(img, p1, p2, color, thickness=1):
  * __img__: Image or frame
  * __p1__: Tuple (x, y). Top-left point
  * __p2__: Tuple (x, y). Bottom-right point
  * __color__: Color. Must match the image's color format (BGR by default)
  * __thickness__: Line thickness in pixels. A value of -1 fills the shape.

```python
img = cv.imread("path/to/file")

rectangle = cv.rectangle(img, (50, 50), (200, 100), (0, 0, 255), thickness=-1)

cv.imshow("Rectangle", rectangle)
```

---
# .blue[3.] Drawing
## Circle
* cv.circle(img, center, radius, color, thickness=1):
  * __img__: Image or frame
  * __center__: Tuple (x, y). Center of the circle.
  * __radius__: Radius in pixels.
  * __color__: Color. Must match the image's color format (BGR by default)
  * __thickness__: Line thickness in pixels. A value of -1 fills the shape.

```python
img = cv.imread("path/to/file")

circle = cv.circle(img, (50, 50), 100, (0, 0, 255), thickness=-1)

cv.imshow("Circle", circle)
```

---
# .blue[3.] Drawing
## Line
* cv.line(img, p1, p2, color, thickness=1):
  * __img__: Image or frame
  * __p1__: Tuple (x, y). Start point.
  * __p2__: Tuple (x, y). End point.
  * __color__: Color. Must match the image's color format (BGR by default)
  * __thickness__: Line thickness in pixels. Must be positive: a line cannot be filled.

```python
img = cv.imread("path/to/file")

line = cv.line(img, (50, 50), (100, 100), (0, 0, 255), thickness=2)

cv.imshow("Line", line)
```

---
# .blue[3.] Drawing
## Text
* cv.putText(img, text, p1, font, fontScale, color, thickness=1):
  * __img__: Image or frame
  * __text__: String to draw.
  * __p1__: Tuple (x, y). Bottom-left point of the text.
  * __font__: Text font.
  * __fontScale__: Scale factor multiplied by the font's base size.
  * __color__: Color. Must match the image's color format (BGR by default)
  * __thickness__: Stroke thickness in pixels.

```python
img = cv.imread("path/to/file")

text = cv.putText(img, 'text', (455, 450), cv.FONT_HERSHEY_TRIPLEX, 1.0, (0,0,0), thickness=2)

cv.imshow("Text", text)
```

---
# .blue[4.] Combining images
## Mask (.blue[mask])
Masking is a bitwise operation used to select the area to which a transformation will be applied.

```python
import numpy as np

img = cv.imread("path/to/file")
mask = np.zeros(img.shape[:2], dtype='uint8')
mask = cv.rectangle(mask, (0, 90), (290, 450), 255, -1)
cv.imshow("Rectangular Mask", mask)

masked = cv.bitwise_and(img, img, mask=mask)
cv.imshow("Masked", masked)
```
???
https://www.pyimagesearch.com/2021/01/19/image-masking-with-opencv/

---
# .blue[4.] Combining images
## Concatenating (.blue[concatenate])

```python
import numpy as np

img1 = cv.imread("path/to/file")
img2 = cv.imread("path/to/file")

# Vertically (both images need the same width)
concatenated_v = np.concatenate((img1, img2), axis=0)

# Horizontally (both images need the same height)
concatenated_h = np.concatenate((img1, img2), axis=1)

cv.imshow("Horizontal", concatenated_h)
cv.imshow("Vertical", concatenated_v)
```
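
`np.concatenate` requires the images to match on every axis except the one being joined. The sketch below uses two synthetic images with the same height but different widths to show which direction works and which raises `ValueError`:

```python
import numpy as np

img1 = np.zeros((200, 300, 3), dtype=np.uint8)   # 200 high, 300 wide
img2 = np.zeros((200, 500, 3), dtype=np.uint8)   # 200 high, 500 wide

# Same height: joining side by side works and the widths add up
concatenated_h = np.concatenate((img1, img2), axis=1)
print(concatenated_h.shape)

# Different widths: stacking vertically is rejected
try:
    np.concatenate((img1, img2), axis=0)
    vertical_ok = True
except ValueError:
    vertical_ok = False
print(vertical_ok)
```

When the sizes differ, one image can be resized with `cv.resize` before concatenating.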

---
# .blue[4.] Combining images
## Overlay (.blue[overlay]) and blending

Array __slicing__ operations in Python + `cv.addWeighted`
* overlay(img, img2, x, y, alpha=1):
  * __img__: Base image
  * __img2__: Image to overlay
  * __x__: Horizontal position in pixels
  * __y__: Vertical position in pixels
  * __alpha__: Proportion of each image in the blend.

```python
img1 = cv.imread("path/to/file")
img2 = cv.imread("path/to/file")

# Result names differ from overlay() so that the function is not shadowed
overlaid = overlay(img1, img2, 500, 100)
cv.imshow("Overlay", overlaid)

blended = overlay(img1, img2, 500, 100, alpha=0.5)
cv.imshow("Blending", blended)
```