In this activity we will build the Hua Ki’i prototype AI image recognition app and consider what is required to adapt it for another language community.
Hua Ki’i was developed during a series of Indigenous Protocols and Artificial Intelligence workshops in 2019, by a team of Indigenous engineers, scholars and language activists. The app was built with the goal of developing an Indigenous language revitalisation tool. Hua Ki’i uses an object recognition system and translates the image detection results into Hawaiian. The app is easily remixed for another language, and is a great platform for exposing bias in AI.
For more information about the background to Hua Ki’i and an introduction to Indigenous Protocols, read the Indigenous AI position paper. https://www.indigenous-ai.net/position-paper/
The activity is in two parts. First we build the app, then we train an object detection system to gain insight into the limitations current technologies have in recognising culturally specific objects for many of the world’s communities.
What are some things to consider if wanting to make this with an Indigenous community in Australia?
The prototype app is limited to recognising particular objects from common categories. One bias in AI is that the categories and objects that can be recognised are not always culturally relevant for a community that wants to build its own tool.
What is required to develop a localised object detection system? Many of the considerations here also apply to speech/sign recognition, translation and language generation.
Publicly available datasets:
COCO (Common Objects in Context): around 330k images annotated with bounding boxes and segmentation masks for 80 everyday object categories
CIFAR (CIFAR-10/CIFAR-100): small 32x32 images labelled for whole-image classification rather than detection
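As a quick illustration of how easy the common benchmarks are to obtain (a sketch assuming torchvision is installed; the data folder name is a placeholder), CIFAR-10 can be downloaded in a couple of lines:

import torchvision

# Download the CIFAR-10 training split (32x32 images, 10 everyday classes) to ./data
cifar = torchvision.datasets.CIFAR10(root="data", train=True, download=True)
print(len(cifar), cifar.classes)

Notice the classes it covers: none of them are culturally specific, which is exactly the gap this activity is about.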
Clean, high-quality data is critical. Tools exist for finding annotation mistakes, verifying models and inspecting subsets of data, e.g. FiftyOne: https://voxel51.com/docs/fiftyone/
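For instance, a labelled dataset in YOLOv5 layout can be loaded into FiftyOne and browsed in its app to check the annotations by eye (a sketch only; the directory path and split name are placeholders):

import fiftyone as fo

# Load a YOLOv5-format dataset (images + label files + data.yaml) into FiftyOne
dataset = fo.Dataset.from_dir(
    dataset_dir="path/to/yolov5-dataset",
    dataset_type=fo.types.YOLOv5Dataset,
    split="train",
)

# Open the interactive app to browse images and spot bad or missing labels
session = fo.launch_app(dataset)
session.wait()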
Datasets can be created from scratch by photographing or scanning culturally relevant objects. Another approach is to bulk-download images from online services such as Google or Flickr.
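One way to script a bulk download (a rough sketch; the icrawler library, the keyword and the folder are my own choices rather than part of the original activity, and downloaded images may carry licensing restrictions):

from icrawler.builtin import GoogleImageCrawler

# Crawl Google Images for a keyword and save the results into a local folder
crawler = GoogleImageCrawler(storage={"root_dir": "downloads/koala"})
crawler.crawl(keyword="koala", max_num=100)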
Training an object detector typically involves resizing the training images to a consistent size, so bulk resizing is usually needed, using tools such as ImageMagick.
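A resizing pass could look like the following Pillow sketch (an alternative to ImageMagick; the folder names and the 640px target are placeholders), with the equivalent ImageMagick one-liner noted in the comment:

from pathlib import Path
from PIL import Image

# Resize every .jpg so its longest side is at most 640px, keeping the aspect ratio
# (roughly equivalent to ImageMagick's: mogrify -path images_resized -resize 640x640 *.jpg)
src = Path("images")
dst = Path("images_resized")
dst.mkdir(exist_ok=True)

for path in src.glob("*.jpg"):
    img = Image.open(path)
    img.thumbnail((640, 640))  # resizes in place, preserving aspect ratio
    img.save(dst / path.name)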
Before training, the images in a dataset need to be labelled/annotated with rectangle (bounding box) or polygon shapes; an example label file is shown below.
Read about different types of image annotation.
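For example, YOLO-style annotation stores one small text file per image (e.g. koala_01.txt alongside koala_01.jpg; the file name and numbers here are invented purely to show the format). Each line is one object: class index, then box centre x, centre y, width and height, all normalised to the image size:

0 0.512 0.433 0.210 0.305
0 0.131 0.760 0.095 0.120

Polygon/segmentation annotation instead records a list of normalised x,y vertex coordinates for each object rather than a single box.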
There are many tools that can be used to label images, including:
FastAnnotationSingleObject
Claims to be able to label approx. 6000 images in 8 hours; is this realistic? The raccoon dataset took 2 hours for 200 images.
labelImg is a classic
Make Sense (makesense.ai)
Great interface; its YOLO export is not the v5 format, so it would need to be converted. Here’s a blog about it.
Roboflow
Has a good annotation/training interface, some existing datasets, and exports in YOLOv5 format (see the example data.yaml below)
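A YOLOv5 export from Roboflow is organised as train/valid image and label folders plus a small data.yaml that points at them, roughly like this (the paths and class name are placeholders):

# data.yaml
train: ../train/images
val: ../valid/images
nc: 1
names: ['koala']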
In the prac we won’t have time to label a large dataset, but let’s try downloading some images from Google and labelling them using Roboflow.
Use the Colab to train YOLOv5 with the raccoon data.
If you are super keen, try training a koala recogniser! What would be involved? How long do you think it would take?
python train.py --img 640 --batch 16 --epochs 3 --data coco128.yaml --weights yolov5s.pt
For the raccoon (or koala) data, point --data at your dataset’s data.yaml instead of coco128.yaml, and expect to need more than 3 epochs.
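Once training finishes you can sanity-check the result from Python via the YOLOv5 torch.hub interface (a sketch; the weights path is YOLOv5’s default output location and the test image path is a placeholder):

import torch

# Load the custom weights produced by train.py
model = torch.hub.load("ultralytics/yolov5", "custom", path="runs/train/exp/weights/best.pt")

# Run detection on a test image, then print and save the annotated results
results = model("test_images/koala.jpg")
results.print()
results.save()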
Mean average precision (mAP) is the standard object detection metric: the area under the precision-recall curve for each class, averaged over all classes. https://blog.roboflow.com/mean-average-precision/
Official Ultralytics YOLOv5 training tutorial notebook: https://colab.research.google.com/github/ultralytics/yolov5/blob/master/tutorial.ipynb#scrollTo=Knxi2ncxWffW
Training a custom YOLOv4 object detector on Google Colab: https://medium.com/analytics-vidhya/train-a-custom-yolov4-object-detector-using-google-colab-61a659d4868
Training your own object detector with TensorFlow’s Object Detection API: https://towardsdatascience.com/how-to-train-your-own-object-detector-with-tensorflows-object-detector-api-bec72ecfe1d9
How to train YOLOv5 on a custom dataset (Roboflow): https://blog.roboflow.com/how-to-train-yolov5-on-a-custom-dataset/